Semantic Similarity of Short Texts
نویسنده
چکیده
This paper presents a method for measuring the semantic similarity of texts using a corpus based measure of semantic word similarity and a normalized and modified versions of the Longest Common Subsequence (LCS) string matching algorithm. Existing methods for computing text similarity have focused mainly on either large documents or individual words. In this paper, we focus on computing the similarity between two sentence or between two short paragraphs. The proposed method can be exploited in a variety of applications involving textual knowledge representation and knowledge discovery. Evaluation results on two different data sets show that our method outperforms several competing methods.
منابع مشابه
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
متن کاملPresentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
متن کاملSemantic Similarity Measure for Pairs of Short Biological Texts
Finding the semantic similarity between biological texts, specially short texts, such as article abstracts and experiment descriptions of microarrays, may throw important information for experts in that field. To date, these methods have not been widely explored. In this paper, a comparison of different measures to calculate the semantic similarity of pairs of short biological texts is presente...
متن کاملCorpus-based and Knowledge-based Measures of Text Semantic Similarity
This paper presents a method for measuring the semantic similarity of texts, using corpus-based and knowledge-based measures of similarity. Previous work on this problem has focused mainly on either large documents (e.g. text classification, information retrieval) or individual words (e.g. synonymy tests). Given that a large fraction of the information available today, on the Web and elsewhere,...
متن کاملA Fast Approach for Semantic Similar Short Texts Retrieval
Retrieving semantic similar short texts is a crucial issue to many applications, e.g., web search, ads matching, questionanswer system, and so forth. Most of the traditional methods concentrate on how to improve the precision of the similarity measurement, while current real applications need to efficiently explore the top similar short texts semantically related to the query one. We address th...
متن کاملImproving Semantic Similarity for Pairs of Short Biomedical Texts with Concept Definitions and Ontology Structure
Finding semantic similarity between short biomedical texts, such as article abstracts or experiment descriptions, may provide important information for health researchers. This paper presents a method for calculating text similarity in the biomedical context. The method implements a pairwise concept semantic similarity measure that uses concept definitions and ontology structure. The respective...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007